Top-k Query Processing in the APPA P2P System
Authors
Abstract
Top-k queries are attractive for users in P2P systems with very large numbers of peers, but they are difficult to support efficiently. In this paper, we propose a fully distributed algorithm for executing Top-k queries in the context of the APPA (Atlas Peer-to-Peer Architecture) data management system. APPA has a network-independent architecture that can be implemented over various P2P networks. Our algorithm requires no global information, does not depend on the existence of particular peers, and has a low bandwidth cost. We validated our algorithm through implementation over a 64-node cluster and through simulation using the BRITE topology generator and SimJava. Our performance evaluation shows that the algorithm has logarithmic scale-up and, by exploiting P2P parallelism, substantially improves Top-k query response time in comparison with baseline algorithms.
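As a generic illustration of what a Top-k query over several peers computes, the sketch below has each peer reduce its own data to a local top-k list and then merges those lists. It is not the algorithm proposed in the paper, whose details are not given in this abstract; the peer names, scores, and the helper functions `local_top_k` and `merge_top_k` are hypothetical and chosen only for the example.

```python
import heapq

def local_top_k(items, k):
    """Return the k highest-scoring (score, item_id) pairs held by one peer."""
    return heapq.nlargest(k, items)

def merge_top_k(partial_lists, k):
    """Merge several local top-k lists into one list of at most k pairs."""
    return heapq.nlargest(k, (pair for lst in partial_lists for pair in lst))

# Hypothetical peers, each holding (score, item_id) pairs for its own data.
peers = {
    "p1": [(0.91, "a"), (0.40, "b"), (0.75, "c")],
    "p2": [(0.88, "d"), (0.10, "e")],
    "p3": [(0.95, "f"), (0.60, "g"), (0.30, "h")],
}

k = 3
# Each peer sends at most k pairs, so traffic does not grow with the
# amount of data stored at a peer.
partials = [local_top_k(items, k) for items in peers.values()]
result = merge_top_k(partials, k)
print(result)
```

Because every item lives at exactly one peer in this sketch, any item in the global top-k must also appear in its own peer's local top-k, so merging the local lists loses nothing.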
Similar resources
Replication and Query Processing in the APPA Data Management System
Advanced P2P applications are likely to need general replication capabilities such as variable granularity and multi-master mode. However, existing replication solutions do not address important properties of P2P systems such as self-organization. In this paper, we address replication and query processing in the context of the APPA (Atlas Peer-to-Peer Architecture) data management system. APPA ...
Traitement de Requêtes Top-k dans les Communautés Virtuelles P2P de Partage de Données. (Top-k Query Processing in P2P Data Sharing Virtual Communities)
Top-k queries have two main advantages for peer-to-peer (P2P) data sharing virtual communities. First, they allow participants to rank the results for their queries based on the existing data in the system as well as on their own preferences. Second, they avoid overwhelming participants with too many results. However, existing top-k query processing techniques for P2P systems make users suffer ...
Semantic Query Routing and Distributed Top-k Query Processing in Peer-to-Peer Networks
Requirements for widely distributed information systems supporting virtual organizations have given rise to a new category of peer-to-peer (p2p) systems called schema-based. In such systems each peer is a database management system in itself, exposing its own schema. In such a setting, a main objective is the efficient search across peer databases by processing each incoming query without overl...
Efficient Early Top-k Query Processing in Overloaded P2P Systems
Top-k query processing in P2P systems has focused on efficiently computing the top-k results while reducing network traffic and query response time. However, in overloaded P2P systems (with very high query loads), some peers may take a long time to answer, thus making the user wait a long time to obtain the final top-k result. In this paper, we address this problem, which we reformulate as earl...
Data Management in the APPA P2P System
Peer-to-peer (P2P) computing offers new opportunities for building highly distributed data systems. Unlike client-server computing, P2P is a very dynamic environment where peers can join and leave the network at any time, and it offers important advantages such as operation without central coordination, peer autonomy, and scale-up to large numbers of peers. However, providing high-level data manage...